-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ACFL23 Instruction Support #425
base: dev
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code all looks good. Though, I'm not sure about the infinite loop checker. If I've missed a discussion about this then please ignore me. But if it is being added as a work around for problems encountered because of erroneous configs or broken logic, shouldn't we be fixing the causes not the symptoms? Erroneous configs are kind of a user error, but we could update the documentation to help them avoid this, and if there is broken logic in SimEng we should be fixing it not plastering over the resulting problem. I suppose I can see the value of this type of check in debug mode to flag a problem to the user if they are running into issues in release, but it seems like unnecessary overhead for Release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most bits look good. Comments mainly about adhearing to the project's style and some confusion on SVE helpers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All looking pretty good - just a few changes needed and some pedantic comments on comments 😅
src/lib/pipeline/ReorderBuffer.cc
Outdated
std::cerr << "[SimEng:ReorderBuffer] Infinite loop detected in rob " | ||
"commit at instruction address " | ||
<< std::hex << uop->getInstructionAddress() << std::dec | ||
<< " (" << uop->getMicroOpIndex() << ")." << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whats the rational for printing the Micro-op index?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may give additional context to the user what exactly is stuck at the head of the ROB if the instruction is uopd. I have updated the comment generally, though we should have a discussion offline on what exactly we want to print out in one of these cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following offline discussion, we agree to keep this message in release mode, but add more detail s.t. the user is aware of why this is being triggered, what's triggering it, and what to do to resolve the issue. The Micro-op index in particular just adds additional verbosity so remains in the message.
@@ -1080,7 +1080,7 @@ TEST_P(Syscall, sched_getaffinity) { | |||
)"); | |||
EXPECT_EQ(getGeneralRegister<int64_t>(21), -1); | |||
EXPECT_EQ(getGeneralRegister<int64_t>(22), -1); | |||
EXPECT_EQ(getGeneralRegister<int64_t>(23), 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What has caused this to change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See below
stateChange = {ChangeType::REPLACEMENT, {R0}, {retval}}; | ||
stateChange.memoryAddresses.push_back({mask, 1}); | ||
uint64_t retval = static_cast<uint64_t>(bitmask); | ||
stateChange = {ChangeType::REPLACEMENT, {R0}, {sizeof(retval)}}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The man page for sched_getaffinity
states that the function returns 0 on success and -1 on failure. This seems to be returning the size of a uint64_t which will always be 8. I think this is incorrect.
What I think you have done is update the value being set in memory correctly on 434 (updating the size). But also updated the value returned to the program to be the size also on 433. Depending on the behaviour we want, 433 should be updated potentially in the way it was done previously i.e. set to 0 if pid == 0
and -1 otherwise.
What was the reason for the update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is worth @jj16791 investigating as it was his find/fix so he will know more than I do on the issue.
The reason given at the time was:
The assert you were triggering was KMP_ASSERT(__kmp_avail_proc == __kmp_topology->get_num_hw_threads());. Newer LLVM OMP runtimes require the affinity mask to be at least 8 bytes in length otherwise it will read the number of available cores out as 0 due to some casting. The affinity mask we were returning was 1 byte in length hence the assert triggered as __kmp_avail_proc was 0. Figured it out from a combination of isolating the instructions run leading up to this assert and then from GodBolt/SimEng figuring out why our mask was being converted to 0 procs available
I've been testing using a STREAM binary (with OpenMP support) compiled with ACFL23. With the current fix, this works. Removing the sizeof
on 433 means that this fails. I do agree though that the current implementation doesn't line up with what I'd expect should work.
…instructions/helpers from neoverse-v2 branch.
…xed a few metadata issues
…Updated comment for infinite loop detector
62561bc
to
c9f708b
Compare
Merging work done towards enabling support for a few codes for ACFL 23, namely STREAM, Minibude, Cloverleaf, Tealeaf, and Minisweep.
This PR is mostly made up of added instruction support. 58 instructions have been added, with 24 unique instructions with the remainder being variants. Most instructions are SVE, with some NEON added.
An additional feature of "infinite loop checking" has been added. This adds a counter in the ROB which throws an error if the same address has been at the head of the ROB for a very long time. This catches a few errors previously found where an erroneous config or broken logic can cause SimEng to get caught in a loop and sometimes eventually hit OOM.
This also fixes an OpenMP bug that has previously popped up for ACFL 23 support, work that Jack had done in a separate branch.
Tests are still being added, and the new group tests need to be added for all instructions. The PR will leave draft stage once all tests have been added.
Here are a list of instructions added: